Introduction

At A Glance

{accessrmd} is currently in development. If you discover bugs or improvements, please review the code of conduct and contribute on GitHub.

{accessrmd} is a package written to help improve the accessibility of Rmarkdown documents. The standard Rmarkdown outputs have HTML structural issues that result in problems for people using screen readers. The purpose of {accessrmd} is to help developers in writing accessible Rmarkdown documents and in converting a back catalogue of documents in need of accessibility amendments.

{accessrmd} currently supports html_output only. It is not yet available on CRAN, but that is the aim once the first release has been published.


What problem does it solve?

The HTML structure of standard rmarkdown outputs is not compliant with WCAG 2.1 level AA, the standard required of all UK government digital services. To demonstrate the HTML checks, I will be enlisting the help of the excellent, open-source WAVE accessibility tool. It doesn't catch everything required for AA compliance, but it's a great way to get started with an accessibility audit, and it includes helpful explanations for newcomers to WCAG 2.1.

The below image shows the output of a WAVE check on the standard Rmarkdown html output. As you can see, there are a number of errors and warnings. Click the image for more detail.

Accessibility check of a standard rmarkdown output showing errors & warnings. Click the image to view the full check on wave.webaim.org, opens in new window.

By executing a few functions from the {accessrmd} package, the html format issues can be easily remedied, without the developer needing to write any HTML. Please observe the output of an Rmarkdown which has been adjusted by {accessrmd} functions (again, you can click for an interactive check):

Accessibility check of an rmarkdown output modified by {accessrmd}, showing no errors or warnings. Click the image to view the full check on wave.webaim.org, opens in new window.
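To illustrate the kind of fix involved, here is a minimal base-R sketch of an accessibility helper in the spirit of {accessrmd}. The function name and behaviour are hypothetical; consult the package documentation for the real API:

```r
# Hypothetical sketch of an {accessrmd}-style helper: emit an image tag with
# mandatory alt text, so screen readers announce a description rather than a
# bare file name.
accessible_img <- function(src, alt) {
  if (missing(alt) || !nzchar(alt)) {
    stop("Alt text is required for accessibility.")
  }
  sprintf('<img src="%s" alt="%s" />', src, alt)
}

accessible_img("plot.png", "Line chart of monthly sales, peaking in June")
```

Refusing to emit the tag without alt text is the point: the helper makes the accessible path the only path.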


Github Actions

A continuous deployment workflow has been employed in the development of {accessrmd}, using GitHub Actions.

This workflow has allowed efficient adaptation of the package modules while ensuring the integrity of the outputs and conditional behaviours.

CD workflows for this package include:

  • CRAN build checks.
  • Test coverage with Codecov.
  • Automated linting.

R CMD Build Check

R build status

An automated test suite analogous to the checks run by CRAN on package submission.

This suite of checks is something that I tend to execute as part of my development practice. However, setting automated checks on push to the remote ensures that human error is mitigated. Collaborators or developers wishing to install the development version of {accessrmd} can be informed of the current state of the repository and assured of the package’s functionality.
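A typical workflow file for this kind of check, abridged from the r-lib/actions templates. Treat this as a sketch rather than the exact workflow used in {accessrmd}:

```yaml
# Abridged R CMD check workflow based on the r-lib/actions examples.
on: [push, pull_request]

name: R-CMD-check

jobs:
  R-CMD-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: r-lib/actions/setup-r@v2
      - uses: r-lib/actions/setup-r-dependencies@v2
        with:
          extra-packages: any::rcmdcheck
      - uses: r-lib/actions/check-r-package@v2
```

Extending the `runs-on` entry into a matrix is how the multi-platform checks in the screenshot below are produced.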

As can be seen from the screenshot below, the package is tested for compatibility across three operating systems: Windows, macOS and two flavours of Ubuntu.

GitHub operating system check manifest

Clicking on an item in the check manifest allows you to view the detailed check schedule. If any checks failed, you can consult the log to help pinpoint the error.

Detailed test schedule for Windows latest release.


Test Coverage With Codecov

Codecov test coverage

I utilise Test-Driven Development when writing software in order to mitigate against misuse cases.

[Embedded TikTok video: @tired_actor, duet with @brock1137, ♬ The Square Hole - Brock]

Jokes aside, ensuring that your functions behave safely is vital. Testing helps guard against misinterpretation and gaps in documentation.

Code coverage gives an indication of the percentage of code lines that are exercised when the test battery is run. However, I would advise caution in assuming that a high coverage equates to quality software. It tends to be very easy to write a suite of tests that result in high coverage. This does not mean that all necessary exception handling has been considered and tested.

Thorough and well-considered test conditions are excellent tools in assuring quality software. It also means that you can turn your failures into future successes - every bug you encounter can be converted to a meaningful test.
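Converting a bug into a test can be as simple as the base-R sketch below. In a package you would normally use {testthat}; the function and the bug report here are hypothetical:

```r
# Suppose a user reported that mean_na() returned NA whenever the input
# contained missing values. The fix drops NAs before averaging:
mean_na <- function(x) mean(x, na.rm = TRUE)

# Regression test capturing the original bug report; it fails loudly if the
# old behaviour ever returns.
stopifnot(identical(mean_na(c(1, 2, NA, 3)), 2))
```

Run on every push, this assertion guarantees the bug can never silently reappear.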

Below, you can view the test coverage workflow on pushing to the remote.

Viewing the automated coverage checks on GitHub

Linting

Automated linting with the {lintr} package runs via GitHub Actions to help ensure the code adheres to the tidyverse style guide, which is also the Data Science Campus guidance. Code readability is important in assisting collaboration.

The accessrmd lint workflow on GitHub
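Project-level linting behaviour can be tuned with a `.lintr` file at the package root. A hedged example follows; the specific linters and thresholds chosen here are illustrative, not the settings used in {accessrmd}:

```
linters: linters_with_defaults(
    line_length_linter(100)
  )
```

The same configuration is then respected whether the lint runs locally via `lintr::lint_package()` or in the CI workflow shown above.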

Perhaps more important than the adopted style is to ensure that code is conscientiously commented. Your colleagues and future self will value the effort taken in explaining your code.


More Projects By The Author

GitHub Projects

‘ptspotter’ Package

CRAN status R build status Codecov test coverage CRAN RStudio mirror downloads

‘ptspotter’ is a package that aims to simplify some of the mundane tasks involved in setting up an analytical pipeline using the fantastic ProjectTemplate framework.

ptspotter is available on CRAN.

ptspotter blog.


Highways England Traffic Data

Road data application interface.

This data pipeline was redeveloped in December 2020 as Brexit approached. It has been an important source of road sensor data to help inform the movement of traffic around English ports. This redevelopment was requested by the faster economic indicators team and the Chief Economist.

The redeveloped pipeline features querying the Highways England RESTful WebTRIS API in parallel using a virtual environment. Additional value beyond the MVP was added with a Shiny UI for data validation and an automated Rmarkdown report, adding insight on the proportion of site types returning null responses.
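The parallel fan-out over sites can be sketched as follows. The base URL and site IDs are illustrative placeholders, not the real WebTRIS endpoint, and the HTTP fetch itself is stubbed out:

```r
# Sketch of fanning out per-site REST queries in parallel.
library(parallel)

build_site_url <- function(site_id, base = "https://example.org/api/v1.0/sites") {
  paste0(base, "/", site_id)
}

site_ids <- c(7012, 7013, 7014)

# mc.cores = 1 keeps this portable (Windows lacks fork); raise it on Unix for
# a genuine parallel fan-out. In the real pipeline each URL would then be
# fetched with httr::GET() and the JSON response parsed.
urls <- unlist(mclapply(site_ids, build_site_url, mc.cores = 1))
urls
```

Keeping URL construction separate from fetching makes the fan-out trivially testable without network access.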

Code on GitHub

Documentation on GitHub Pages


DSC GitHub Scrape

Code on GitHub

In the summer of 2020 I was assigned to develop a solution ensuring that the team's external website reflected course status on GitHub. With so much course QA activity in our backlog, the website frequently fell out of date.

The pipeline I developed interacts with the GitHub REST API to return all company repositories prefixed with “DSCA_” (capability repositories). In collaboration with the campus faculty, I developed a standardised README for all courses, from which the required metadata fields could be scraped.

The resulting table of course repos and metadata is cached monthly to .rds, which is then used to detect changes in the course catalogue state. If the number of course repos changes or a course version is incremented, an automated email is sent via the Gmail API to our internal comms team, with a summary of the change and an attached CSV file containing the updated course catalogue. This routine is scheduled monthly using cron.

Automated Email generated by the course catalogue pipeline.
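The change-detection step at the heart of that routine can be sketched like this; the cache path, fields and version values are illustrative:

```r
# Sketch of monthly change detection against a cached .rds snapshot.
cache_path <- file.path(tempdir(), "course_catalogue.rds")

# Last month's snapshot, as the pipeline would have cached it.
previous <- data.frame(
  repo    = c("DSCA_intro_r", "DSCA_git"),
  version = c("1.0", "2.1")
)
saveRDS(previous, cache_path)

# This month's scrape: one course version has been incremented.
current <- data.frame(
  repo    = c("DSCA_intro_r", "DSCA_git"),
  version = c("1.1", "2.1")
)

cached  <- readRDS(cache_path)
changed <- nrow(current) != nrow(cached) || any(current$version != cached$version)
changed  # TRUE here would trigger the automated email to comms
```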


Shiny Applications

These applications are presented in reverse order of publication. While the contexts of the applications vary, they help to illustrate an increasing maturity in Shiny development.

Earth Observation Wildfires

I am currently involved in a wildfire remote sensing project with DSC colleagues. We all attended the Summer 2021 NASA advanced remote sensing training and have started development in Python & R, identifying and classifying fire risk from satellite rasters.

In order to assist the team's exploratory work and model evaluation, I have written an app that allows the developer full control over the NASA Fire Information for Resource Management System (FIRMS) API. The application uses authenticated HTTP requests via the Web Map Service (WMS) standard to ingest FIRMS tiles.

FIRMS app user interface.
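A WMS GetMap request is ultimately just a parameterised URL. The sketch below builds one; the endpoint, layer name and key parameter are placeholders, not the real FIRMS values:

```r
# Sketch of constructing a WMS 1.3.0 GetMap request URL.
wms_getmap_url <- function(base, layer, bbox, key) {
  params <- c(
    SERVICE = "WMS", VERSION = "1.3.0", REQUEST = "GetMap",
    LAYERS  = layer,
    BBOX    = paste(bbox, collapse = ","),
    CRS     = "EPSG:4326", WIDTH = "512", HEIGHT = "512",
    FORMAT  = "image/png",
    MAP_KEY = key  # placeholder auth parameter
  )
  paste0(base, "?", paste(names(params), params, sep = "=", collapse = "&"))
}

url <- wms_getmap_url(
  base  = "https://example.org/wms",
  layer = "fires_viirs_24",
  bbox  = c(-5.5, 51.3, -2.6, 53.4),  # illustrative Wales-ish bounding box
  key   = "YOUR_KEY"
)
url
```

In the app, the collapsible-pane inputs feed exactly these parameters, so each change rebuilds the request and refreshes the tiles.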

The developers can use the various inputs available within the collapsible panes to tune the query parameters, updating the API response and the resultant map appearance.

FIRMS UI collapsible panes for inputs.

Unifyr

This is an educational application designed to help trainee data scientists in developing their understanding of the different data join functions available within the {dplyr} package. Unifyr allows the learner to select subsets from the well-known gapminder dataset and observe the output of specified join functions.

Unifyr application interface. Click the image to see Unifyr on shinyapps.io, opens in new window.
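The comparison Unifyr teaches can be sketched with base R's merge(), the base analogue of the {dplyr} joins the app uses; the toy data here is illustrative rather than the gapminder dataset itself:

```r
# Two small tables with a shared key column, as a learner might subset them.
countries <- data.frame(
  country   = c("Wales", "France", "Chad"),
  continent = c("Europe", "Europe", "Africa")
)
pop <- data.frame(
  country = c("Wales", "France"),
  pop_m   = c(3.1, 67.4)
)

inner <- merge(countries, pop)                # like dplyr::inner_join(): matching rows only
left  <- merge(countries, pop, all.x = TRUE)  # like dplyr::left_join(): keep all countries

nrow(inner)  # matching rows only
nrow(left)   # all countries; Chad gets NA for pop_m
```

Seeing which rows survive (and where NAs appear) is exactly the intuition the app builds interactively.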


Google Mobility Data

This Google Mobility Data application was developed at the start of the Covid-19 pandemic, using data derived by optically processing the PDF data publications.

The application allows the user to select specific local authorities or NHS bodies in order to view the time series mobility data.

Google mobility application interface. Click the image to see Google mobility data on shinyapps.io, opens in new window.


Welsh School Funding

This application combines two open datasets published by the Welsh Government: budgeted educational revenue and outturn expenditure. The user can select from the available data dimensions to plot on the chart axes by educational phase. Time series data for all available schools and data dimensions can also be viewed on the second tab.

Schools analysis application interface. Click the image to see Welsh school funding data on shinyapps.io, opens in new window.


Common Welsh Place Names

This app was a bit of fun and took very little time. Using some basic language processing and geolocation, it plots Welsh location names with common prefixes such as "Cwm" or "Aber" on a map. The app used open data published by the Welsh Language Commissioner, who subsequently approached me with interest in the project, leading me to share the codebase with them.

Common Welsh place names application interface. Click the image to see common Welsh place names on shinyapps.io, opens in new window.
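The basic language processing amounts to a prefix match with a regular expression, for example:

```r
# Detect common Welsh place-name prefixes (the place list is illustrative).
places <- c("Cwmbran", "Aberystwyth", "Cardiff", "Abergavenny", "Cwmavon")

prefix_pattern <- "^(Cwm|Aber|Llan|Pen)"
matched <- places[grepl(prefix_pattern, places)]
matched
```

Each matched name would then be geocoded and dropped onto the map as a point.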


Cardiff Tide Levels

This repo visualises LiDAR altitude data for Cardiff, openly published by Natural Resources Wales. Cardiff was selected for its comparatively high data quality and population density. The visualisations compare the present-day high tide altitude with a modelled 2 metre sea rise by the year 2100.

Cardiff sea levels map using LiDAR data. Click the image to see Cardiff high tide levels on shinyapps.io, opens in new window.


RPubs

Remote Sensing Trees

In 2018 I was approached by a colleague to discuss the feasibility of using satellite data to help estimate the quantity of woodland stock in a defined rural area. The report documents my exploratory work in this area using open source geospatial frameworks and satellite rasters. The work pointed to a promising avenue of analysis, but also indicated limitations in the use of open source Landsat imagery. At the time I recommended procuring higher resolution remote imagery and verifying tree counts with ground truth observational work.

Satellite earth observation of Wales. Click the image to see the counting trees from space exploratory report on RPubs, opens in new window.


Welsh Schools Funding ’18/19

This {flexdashboard} user interface was one of my first pieces of application development work. The dashboard is an adapted RMarkdown output and can easily be reproduced with some basic data manipulation and markdown syntax. Furthermore, the product is self-contained and can be emailed and freely shared, unlike a Shiny application, which requires a connection to R and a Shiny server.

The dashboard presents budgeted education revenue for the 2018/19 financial year. An interactive map of geolocated schools in Wales is presented, with informative tooltips highlighting the data dimensions. Additional tabs present linear regression models of pupil numbers against delegated budgets for each educational phase.

Map of Welsh secondary school locations. Click the image to see the Welsh school funding dashboard on RPubs, opens in new window.
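The models behind those regression tabs are ordinary least squares fits of the form lm(budget ~ pupils). A minimal sketch with synthetic data (the figures below are invented, not the published Welsh Government values):

```r
# Synthetic pupil counts and delegated budgets for one educational phase.
set.seed(42)
pupils <- c(120, 250, 400, 610, 800)
budget <- 50000 + 4000 * pupils + rnorm(5, sd = 20000)

# Fit per-pupil funding as a simple linear model.
fit <- lm(budget ~ pupils)
coef(fit)  # intercept plus the estimated per-pupil funding slope
```

Fitting one such model per educational phase lets the dashboard compare per-pupil funding rates across phases.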